1National Engineering and Technology Center for Information Agriculture (NETCIA), MARA Key Laboratory of Crop System Analysis and Decision Making, MOE Engineering Research Center of Smart Agriculture, Jiangsu Key Laboratory for Information Agriculture, Institute of Smart Agriculture, Nanjing Agricultural University, Nanjing, Jiangsu, China.
2Zhongshan Biological Breeding Laboratory, Nanjing, China.
3Provincial Key Laboratory of Agrobiology, Institute of Germplasm Resources and Biotechnology, Jiangsu Academy of Agricultural Sciences, Nanjing, Jiangsu, China.
4These authors contributed equally to this work.
Received 21 Feb 2024 | Accepted 18 May 2024 | Published 08 Jul 2024
Efficient and accurate measurement of rice grain protein content (GPC) is important for selecting high-quality rice varieties, and remote sensing is an attractive candidate method for this task. However, most multispectral sensors are poor predictors of GPC because of their broad spectral bands. Hyperspectral technology offers a new analytical approach for bridging the gap between phenomics and genomics. However, typical datasets are small, which constrains the construction of GPC estimation models, limiting their accuracy and their ability to generalize across a wide range of varieties. In this study, we used hyperspectral data of rice grains from 515 japonica varieties and deep convolutional generative adversarial networks (DCGANs) to generate simulated data and thereby improve model accuracy. Features sensitive to GPC were extracted after applying a continuous wavelet transform (CWT), and the GPC estimation model was constructed by partial least squares regression (PLSR). Finally, a genome-wide association study (GWAS) was applied to the measured and generated datasets to detect GPC loci. The results demonstrated that the simulated GPC values generated after 8,000 epochs were closest to the measured values. The wavelet feature (WF1743, 2), obtained from the data with the addition of 200 simulated samples, exhibited the highest GPC estimation accuracy (R² = 0.58 and RRMSE = 6.70%). The GWAS showed that the estimated values based on the simulated data detected the same loci as the measured values, including the OsmtSSB1L gene related to grain storage protein. This study provides a new technique for the efficient genetic study of phenotypic traits in rice based on hyperspectral technology.
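For readers unfamiliar with the two accuracy metrics quoted above, the following is a minimal sketch of how R² and RRMSE are commonly computed. It rests on assumptions not stated in the abstract: R² is taken as the coefficient of determination, and RRMSE as the root mean square error divided by the mean of the measured values, expressed as a percentage. The function names and the sample GPC values are hypothetical and serve only as illustration; they are not data from this study.

```python
import numpy as np


def r_squared(measured, estimated):
    """Coefficient of determination (assumed definition of R^2)."""
    measured = np.asarray(measured, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    ss_res = np.sum((measured - estimated) ** 2)        # residual sum of squares
    ss_tot = np.sum((measured - measured.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot


def rrmse_percent(measured, estimated):
    """Relative RMSE in percent (assumed: RMSE / mean of measured values)."""
    measured = np.asarray(measured, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    rmse = np.sqrt(np.mean((measured - estimated) ** 2))
    return 100.0 * rmse / measured.mean()


# Invented GPC values (%), for illustration only.
measured = [8.0, 9.0, 10.0, 11.0, 12.0]
estimated = [8.5, 8.5, 10.0, 11.5, 11.5]

r2 = r_squared(measured, estimated)
rrmse = rrmse_percent(measured, estimated)
print(f"R2 = {r2:.2f}, RRMSE = {rrmse:.2f}%")
```

Under these definitions, a lower RRMSE and a higher R² both indicate closer agreement between estimated and measured GPC. Some studies instead report R² as the squared Pearson correlation, which can differ from the value computed here.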